Introduction to Googlebot spoofing
In this article, I’ll describe how and why to use Google Chrome (or Chrome Canary) to view a website as Googlebot.
We’ll set up a web browser specifically for Googlebot browsing. Using a user-agent browser extension is often close enough for SEO audits, but extra steps are needed to get as close as possible to emulating Googlebot.
Skip to “How to set up your Googlebot browser”.
Why should I view a website as Googlebot?
For many years, us technical SEOs had it easy when auditing websites, with HTML and CSS being web design’s cornerstone languages. JavaScript was generally used for embellishments (such as small animations on a webpage).
Increasingly, though, whole websites are being built with JavaScript.
Originally, web servers sent complete websites (fully rendered HTML) to web browsers. These days, many websites are rendered client-side (in the web browser itself) – whether that’s Chrome, Safari, or whatever browser a search bot uses – meaning the user’s browser and device must do the work to render a webpage.
SEO-wise, some search bots don’t render JavaScript, so won’t see webpages built using it. Especially when compared to HTML and CSS, JavaScript is very expensive to render. It uses much more of a device’s processing power — wasting the device’s battery life— and much more of Google’s, Bing’s, or any search engine’s server resource.
Even Googlebot has difficulties rendering JavaScript and delays rendering of JavaScript beyond its initial URL discovery – sometimes for days or weeks, depending on the website. When I see “Discovered – currently not indexed” for several URLs in Google Search Console’s Coverage (or Pages) section, the website is more often than not JavaScript-rendered.
Attempting to get around potential SEO issues, some websites use dynamic rendering, so each page has two versions:
- A server-side render for bots (such as Googlebot and bingbot).
- A client-side render for people using the website.
Generally, I find that this setup overcomplicates websites and creates more technical SEO issues than a server-side rendered or traditional HTML website. A mini rant here: there are exceptions, but generally, I think client-side rendered websites are a bad idea. Websites should be designed to work on the lowest common denominator of a device, with progressive enhancement (through JavaScript) used to improve the experience for people, using devices that can handle extras. This is something I will investigate further, but my anecdotal evidence suggests client-side rendered websites are generally more difficult to use for people who rely on accessibility devices such as a screen reader. There are instances where technical SEO and usability crossover.
Technical SEO is about making websites as easy as possible for search engines to crawl, render, and index (for the most relevant keywords and topics). Like it or lump it, the future of technical SEO, at least for now, includes lots of JavaScript and different webpage renders for bots and users.
Viewing a website as Googlebot means we can see discrepancies between what a person sees and what a search bot sees. What Googlebot sees doesn’t need to be identical to what a person using a browser sees, but main navigation and the content you want the page to rank for should be the same.
That’s where this article comes in. For a proper technical SEO audit, we need to see what the most common search engine sees. In most English language-speaking countries, at least, that’s Google.
Can we see exactly what Googlebot sees?
No.
Googlebot itself uses a (headless) version of the Chrome browser to render webpages. Even with the settings suggested in this article, we can never be exactly sure of what Googlebot sees. For example, no settings allow for how Googlebot processes JavaScript websites. Sometimes JavaScript breaks, so Googlebot might see something different than what was intended.
The aim is to emulate Googlebot’s mobile-first indexing as closely as possible.
When auditing, I use my Googlebot browser alongside Screaming Frog SEO Spider’s Googlebot spoofing and rendering, and Google’s own tools such as URL Inspection in Search Console (which can be automated using SEO Spider), and the render screenshot and code from the Mobile Friendly Test.
Even Google’s own publicly available tools aren’t 100% accurate in showing what Googlebot sees. But along with the Googlebot browser and SEO Spider, they can point towards issues and help with troubleshooting.
Why use a separate browser to view websites as Googlebot?
1. Convenience
Having a dedicated browser saves time. Without relying on or waiting for other tools, I get an idea of how Googlebot sees a website in seconds.
While auditing a website that served different content to browsers and Googlebot, and where issues included inconsistent server responses, I needed to switch between the default browser user-agent and Googlebot more often than usual. But constant user-agent switching using a Chrome browser extension was inefficient.
Some Googlebot-specific Chrome settings don’t save or transport between browser tabs or sessions. Some settings affect all open browser tabs. E.g., disabling JavaScript may stop websites in background tabs that rely on JavaScript from working (such as task management, social media, or email applications).
Aside from having a coder who can code a headless Chrome solution, the “Googlebot browser” setup is an easy way to spoof Googlebot.
2. Improved accuracy
Browser extensions can impact how websites look and perform. This approach keeps the number of extensions in the Googlebot browser to a minimum.
3. Forgetfulness
It’s easy to forget to switch Googlebot spoofing off between browsing sessions, which can lead to websites not working as expected. I’ve even been blocked from websites for spoofing Googlebot, and had to email them with my IP to remove the block.
For which SEO audits are a Googlebot browser useful?
The most common use-case for SEO audits is likely websites using client-side rendering or dynamic rendering. You can easily compare what Googlebot sees to what a general website visitor sees.
Even with websites that don’t use dynamic rendering, you never know what you might find by spoofing Googlebot. After over eight years auditing e-commerce websites, I’m still surprised by issues I haven’t come across before.
Example Googlebot comparisons for technical SEO and content audits:
- Is the main navigation different?
- Is Googlebot seeing the content you want indexed?
- If a website relies on JavaScript rendering, will new content be indexed promptly, or so late that its impact is reduced (e.g. for forthcoming events or new product listings)?
- Do URLs return different server responses? For example, incorrect URLs can return 200 OK for Googlebot but 404 Not Found for general website visitors.
- Is the page layout different to what the general website visitor sees? For example, I often see links as blue text on a black background when spoofing Googlebot. While machines can read such text, we want to present something that looks user-friendly to Googlebot. If it can’t render your client-side website, how will it know? (Note: a website might display as expected in Google’s cache, but that isn’t the same as what Googlebot sees.)
- Do websites redirect based on location? Googlebot mostly crawls from US-based IPs.
It depends how in-depth you want to go, but Chrome itself has many useful features for technical SEO audits. I sometimes compare its Console and Network tab data for a general visitor vs. a Googlebot visit (e.g. Googlebot might be blocked from files that are essential for page layout or are required to display certain content).